Sorting 100 TB on Google Compute Engine
Authors
Abstract
Google Compute Engine (GCE) offers a high-performance, cost-effective means for running I/O-intensive applications. This report details our experience running large-scale, high-performance sorting jobs on GCE. We run sort applications up to 100 TB in size on clusters of up to 299 VMs, and find that we are able to sort data at or near the hardware capabilities of the locally attached SSDs. In particular, we sort 100 TB on 296 VMs in 915 seconds at a cost of $154.78. We compare this result to our previous sorting experience on Amazon Elastic Compute Cloud (EC2) and find that GCE can deliver similar levels of performance. Although individual EC2 VMs have higher levels of performance than GCE VMs, permitting significantly smaller cluster sizes on EC2, we find that the total dollar cost that the user pays on GCE is 48% less than the cost of running on EC2.
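The headline figures in the abstract can be turned into a few derived rates. The following is a minimal sketch and not part of the report: the function name `sort_metrics` is hypothetical, it assumes 1 TB = 10^12 bytes (the abstract does not say), and it assumes the 48% saving refers to a comparable 100 TB job on EC2. The throughput values are sort rates (input bytes sorted per second), not raw disk bandwidth, since a sort reads and writes its data more than once.

```python
# Back-of-the-envelope figures derived from the numbers quoted in the abstract.
# Assumption (not stated in the abstract): 1 TB = 10**12 bytes.

def sort_metrics(data_tb, num_vms, elapsed_s, cost_usd, savings):
    """Return derived rates for a sort run (hypothetical helper)."""
    total_bytes = data_tb * 10**12
    bytes_per_s = total_bytes / elapsed_s
    return {
        # Aggregate sort rate across the whole cluster, in GB/s.
        "aggregate_gb_per_s": bytes_per_s / 10**9,
        # Average sort rate contributed by each VM, in MB/s.
        "per_vm_mb_per_s": bytes_per_s / num_vms / 10**6,
        # Dollar cost per terabyte sorted.
        "cost_per_tb_usd": cost_usd / data_tb,
        # EC2 cost implied by "GCE is `savings` cheaper" (an assumption
        # about how the 48% figure was computed).
        "implied_ec2_cost_usd": cost_usd / (1 - savings),
    }

metrics = sort_metrics(data_tb=100, num_vms=296, elapsed_s=915,
                       cost_usd=154.78, savings=0.48)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the run works out to roughly 109 GB/s across the cluster, about 369 MB/s per VM, and about $1.55 per terabyte sorted.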
Similar resources
A Day Late and a Dollar Short: The Case for Research on Cloud Billing Systems
Cloud computing platforms such as Amazon Web Services, Google Compute Engine, and Rackspace Public Cloud have been the subject of numerous measurement studies considering performance, reliability, and cost efficiency. However, little attention has been paid to billing. Cloud providers rely upon complex, large-scale billing systems that track customer resource usage at fine granularity and gener...
Co-location Detection on the Cloud
In this work we focus on the problem of co-location as a first step in conducting cross-VM attacks such as Prime and Probe or Flush+Reload in commercial clouds. We demonstrate and compare three co-location detection methods, namely a cooperative Last-Level Cache (LLC) covert channel, software profiling on the LLC, and memory bus locking. We conduct our experiments on three commercial clouds, Amazo...
Implementation of a Handheld Compute Engine for Personal Health Devices
In this paper, we propose a handheld Compute Engine (CE) for personal health devices (PHDs). The CE is a device that receives measurements from more than one PHD, and collects, analyzes and displays the received measurements. It also transfers the collected measurements to a remote monitoring server for more informative analysis. We implement the CE on a handheld device (i.e., smartphone) to pr...
Building IAMC: A Layered Approach
Internet Accessible Mathematical Computation (IAMC) is a distributed system to make mathematical computation easily and widely available on the Web/Internet. The architecture and implementation of a framework for building IAMC systems are presented. Protocol layers, allowing any well-defined encodings for mathematical data, connect the IAMC client (Icl) and server (Isv). An external engine inter...